Text mining resources for the life sciences

Authors

  • Piotr Przybyla
  • Matthew Shardlow
  • Sophie Aubin
  • Robert Bossy
  • Richard Eckart de Castilho
  • Stelios Piperidis
  • John McNaught
  • Sophia Ananiadou
Abstract

Text mining is a powerful technology for quickly distilling key information from vast quantities of biomedical literature. However, to harness this power the researcher must be well versed in the availability, suitability, adaptability, interoperability and comparative accuracy of current text mining resources. In this survey, we give an overview of the text mining resources that exist in the life sciences to help researchers, especially those employed in biocuration, to engage with text mining in their own work. We categorize the various resources under three sections: Content Discovery looks at where and how to find biomedical publications for text mining; Knowledge Encoding describes the formats used to represent the different levels of information associated with content that enable text mining, including those formats used to carry such information between processes; Tools and Services gives an overview of workflow management systems that can be used to rapidly configure and compare domain- and task-specific processes, via access to a wide range of pre-built tools. We also provide links to relevant repositories in each section to enable the reader to find resources relevant to their own area of interest. Throughout this work we give a special focus to resources that are interoperable, that is, those that have the crucial ability to share information, enabling smooth integration and reusability.

Similar resources

08131 Executive Summary -- Ontologies and Text Mining for Life Sciences : Current Status and Future Perspectives

Researchers in Text Mining and researchers active in developing ontological resources provide solutions to preserve semantic information properly, i.e. in ontologies and/or fact databases. Researchers from both fields tend to work independently from each other, but there is a shared interest to profit from ongoing research in the complementary domain. The relatedness of both domains has led to ...

Sustainable development and environmental challenges in Cameroon’s mining sector: A review

Cameroon has a strong geological potential for a number of mineral resources that, if well managed, could support economic growth. The country contains potentially large deposits of iron ore, gold, bauxite, diamond, limestone, nickel, and gemstones, and indices of other numerous minerals and precious metals. Despite its geological wealth, mining has never played a major role in Cameroon’s econo...

Positional Paper on a Semantic Web for Life Sciences

Our research primarily involves the application of natural language processing technology to biomedical literature in support of such applications as semi-automated functional annotation of proteins and genes, and gene name normalization for improved search and retrieval of text information. We have performed studies in the use of existing database resources in these efforts (Morgan, Hirschman ...

Competitive Intelligence Text Mining: Words Speak

Competitive intelligence (CI) has become one of the major subjects for researchers in recent years. The present research aims to achieve part of CI by investigating the scientific articles in this field through text mining in three interrelated steps. In the first step, a total of 1143 articles released between 1987 and 2016 were selected by searching the phrase "competitive intellige...

Text Mining and Management Tools for Resource Construction and Validation in the Life Sciences

In this talk, I am concerned with the question of what it really takes to move forward in the age of information, along the well-known progression of human understanding, or data-information-knowledge-truth. After an overview of research directions of our group at KAIST, I present four text mining and management tools for resource construction and validation: Automatic gene summary generation, e...


Journal title:

Volume 2016, Issue

Pages -

Publication date: 2016